The rank envelope test (Myllymaki et al., Global envelope tests for spatial processes, 
arXiv:1307.0239 [stat.ME]) is proposed as a solution to the multiple testing problem for 
Monte Carlo tests. Three different situations are recognized: 1) a few univariate Monte 
Carlo tests, 2) a Monte Carlo test with a function as the test statistic, 3) several Monte 
Carlo tests with functions as test statistics. The rank test has correct (global) type I error 
in each case and it is accompanied by a p-value and by a graphical interpretation which 
shows which subtest or which distances of the used test function(s) lead to the rejection 
at the prescribed significance level of the test. Examples of null hypotheses from point 
process and random set statistics are used to demonstrate the strength of the rank envelope 
test. The examples include a goodness-of-fit test with several test functions, a goodness-of-fit 
test for one group of point patterns, a comparison of several groups of point patterns, a test of 
dependence of components in a multi-type point pattern, and a test of the Boolean assumption 
for random closed sets. 

KEY WORDS: Anova, Boolean model test, Envelope test, Extreme rank ordering, 
Goodness-of-fit test, Multi-type point process, Permutation test, Rank envelope test, 
Simulation, Superposition hypothesis 


1. INTRODUCTION 


Nowadays, Monte Carlo tests are used in many applications. In particular, these tests are used in 
fields where no analytical results are usually available. One such field is spatial statistics. In our 
work, we concentrate mainly on a subfield of spatial statistics, spatial point processes. In a Monte 
Carlo test, a test statistic T is chosen and the statistic estimated from the data ($T_1$) is compared 
to s simulated statistics obtained from simulations under the null hypothesis ($T_2, \ldots, T_{s+1}$). If 
the data statistic $T_1$ takes an extreme rank among all the statistics, the null hypothesis is rejected. 
This kind of Monte Carlo test was introduced by Barnard (1963) and popularized for spatial point 
patterns by Besag and Diggle (1977). Throughout this paper we will consider this type of Monte 
Carlo test only. 
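
As a minimal sketch of the univariate Monte Carlo test described above, the following Python snippet computes the p-value of the data statistic among s simulated statistics. It assumes a one-sided test in which large values are extreme; the null model, the observed value and the function name are hypothetical and serve only as illustration.

```python
import random


def monte_carlo_p_value(observed, simulated):
    """One-sided Monte Carlo p-value; large values of the statistic count as extreme."""
    s = len(simulated)
    # The observed value is itself one of the s + 1 statistics, hence the "1 +".
    at_least_as_extreme = 1 + sum(1 for t in simulated if t >= observed)
    return at_least_as_extreme / (s + 1)


# Hypothetical null model: the statistic is the mean of 20 standard normal draws.
rng = random.Random(1)
simulated = [sum(rng.gauss(0.0, 1.0) for _ in range(20)) / 20 for _ in range(999)]
observed = 0.9  # hypothetical data statistic
print(monte_carlo_p_value(observed, simulated))
```

With s simulations the smallest attainable p-value is 1/(s + 1), which is why s is usually chosen so that the significance level is a multiple of 1/(s + 1).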

The chosen test statistic T can be univariate or multivariate. This paper considers the 
multivariate case where the components of the vector are generally dependent. In this case, the usual 
way to perform a test is to transform T to a univariate statistic, in order to avoid the multiple testing 
problem which arises since one wants to base the test on all m > 1 components of T. Such a 
solution is called a deviation test in point process statistics. In these deviation tests (see e.g. Illian 
et al. 2008; Myllymaki et al. 2015a), the (scaled) maximum or integral measure of all components 
is used as the transformation. In this work, we consider another solution to the 
problem of multiple testing, namely the extreme rank ordering (Myllymaki et al. 2015b). This 
ordering gives exactly the same weight to every component of T and it provides graphical 
visualization for all components, which is seen by practitioners as a great advantage. The rank test is 
the Monte Carlo test based on the extreme rank ordering. 


Another solution to the multiple testing problem is to use Bonferroni-type corrections (Simes 
1986; Hommel 1988; Hochberg 1988; Rom 1990). Such methods are rather conservative, 
especially when a large number of dependent tests is considered. Further, Benjamini and 
Hochberg (1995) introduced a method based on controlling the false discovery rate, which weakly 
controls the global type I error. On the other hand, the rank test considered in this paper is exact 
(in the sense of correct type I error under a simple null hypothesis) for any number of Monte 
Carlo tests, either dependent or independent, and for any number of simulations s. 

The rank envelope test was first introduced by Myllymaki et al. (2015b) for the case where 
T is a test function, which is in practice discretized to a high dimensional test vector. Myllymaki 
et al. (2015b) used the rank test for goodness-of-fit testing of point process models. This test 
provides both an exact p-value and graphical visualization. The graphical visualization is given by 
the $100(1-\alpha)\%$ simultaneous envelope, which has an intuitive meaning: if the data test vector is 
outside the simultaneous envelope (for at least one component of the test vector), the null 
hypothesis is rejected at the prescribed level $\alpha$. In this paper, we generalize this idea to a general test 
vector of any dimension. In particular, we show how this procedure can be used 1) for a test vector 
with a low number of components, 2) for a test vector with many highly correlated components 
(suggested in Myllymaki et al. 2015b) and 3) for a test vector with almost independent blocks 
with high inner correlation. The last case covers a rank test based on several test functions, i.e. 
combining several rank tests where each rank test is performed on a different test function. It also 
allows a post-hoc comparison within such a combined rank test. We show several possible 
uses of the rank test in these situations by examples taken from spatial statistics. 

First of all we show, in Section 4, the use of a low dimensional test vector as a solution to the 
multiple testing problem in a goodness-of-fit test of the Boolean model. 

In Section 5 we use the rank test to solve the multiple testing problem of several goodness-of-fit 
tests performed with different test functions on the same data. Usually it is not known in 
advance which test function is sensitive enough to reveal deviations from the given null hypothesis. If 
one wants to use several test functions, then the combined rank test can be employed to obtain 
one common p-value for the combined test. The graphical visualization is possible and the 
simultaneous envelope is given jointly for all test functions (on the global level $\alpha$). We also perform 
a simulation study to explore how the power is affected by the use of several test functions in 
comparison to using only one. 

In Section 6, we consider a goodness-of-fit test for replicated point pattern data and show how 
the combined rank test can be used to overcome the multiple testing problem. The combined rank 
test provides one common p-value and also identifies which of the point patterns is the reason 
for the (potential) rejection of the null model. Testing and identification are made on the global 
level $\alpha$. 

In Section 7, we discuss the problem of comparing several groups of point patterns by the 
combined rank test, which leads to a kind of nonparametric functional ANOVA. An advantage of the 
combined rank test is that it immediately provides graphical post-hoc comparison, which is done 
on the global level $\alpha$. 

In Section 8, we address the problem of dependence of components of a multi-type point 
pattern with more than two types. 

The rank test and its graphical visualization are first explained in detail in Section 2. In 
Section 3, the number of simulations needed to perform the rank test is discussed. Section 9 is 
devoted to further discussion. 

The proposed method is provided in an R library spptest, which can be obtained at 
https://github.com/myllym/spptest. 


2. MULTIVARIATE MONTE CARLO TESTS BASED ON POINTWISE RANKS 


The idea of the multiple Monte Carlo testing considered in this paper is based on the rank envelope 
test introduced in Myllymaki et al. (2015b). In that paper the rank envelope test was 
considered in detail for a functional test statistic, which is typically an estimator of a (well-known) 
summary function. In the present paper, we extend this idea to a general multivariate vector 
of the form 

$$T = (T_1, \ldots, T_d).$$

This extension enables us to consider various test hypotheses which are not covered in the 
original case of Myllymaki et al. (2015b). In Section 4, T consists of only a few measurements 
of intrinsic volumes on a random closed set; in Section 5, values of several different summary 
functions estimated on a point pattern are combined into one vector; in Sections 6 and 7, the 
vector consists of estimates of the same summary function on several patterns. Finally, in 
Section 8 the vector consists of estimates of different summary characteristics of a multivariate 
point pattern. 

Further, we also define a one-sided test, whereas in the previous work only two-sided tests 
were considered. Although the extensions considered in this section are rather straightforward, 
we briefly define them. 


2.1 Rank envelope test 


Let $T_1$ be the observed statistic, and $T_2, \ldots, T_{s+1}$ be a sample of s realizations of T under the 
null hypothesis. The rank envelope test (Myllymaki et al. 2015b) with level $\alpha$ constructs a set 
$\{T^{(\alpha)}_{\mathrm{low}}, T^{(\alpha)}_{\mathrm{upp}}\}$ of envelope vectors such that, under the simple null hypothesis, the probability 
that $T_1 = (T_{11}, \ldots, T_{1d})$ falls outside this envelope in any of the d points is less than or equal to $\alpha$, 

$$\Pr\left(T_{1j} \notin [T^{(\alpha)}_{\mathrm{low}\,j}, T^{(\alpha)}_{\mathrm{upp}\,j}] \text{ for any } j \mid H_0\right) \le \alpha,$$

and the probability that $T_1$ falls outside this envelope or touches it in any of the d points is greater 
than $\alpha$, 

$$\Pr\left(T_{1j} \notin (T^{(\alpha)}_{\mathrm{low}\,j}, T^{(\alpha)}_{\mathrm{upp}\,j}) \text{ for any } j \mid H_0\right) > \alpha.$$

In goodness-of-fit tests, the realizations $T_2, \ldots, T_{s+1}$ are independent (or at least 
exchangeable), and generated by simulation. There is also a possibility to generate the realizations by 
permuting samples of the observed data. Such tests are often called permutation tests. 
Simulation based tests are dealt with in Sections 4, 5, 6 and 8, while Section 7 uses permutation. 


2.1.1 Calculation of p-values 

The test is easiest to understand from the perspective of the associated p-values. According to the 
framework of Barnard's Monte Carlo test or Pitman's permutation test, p-values are obtained by 
assigning an extreme rank $R_i$ to each of the vectors $T_i$, such that the lowest ranks correspond to 
the most extreme values of the statistic. The conservative and liberal p-values are then given as 

$$p_+ = \sum_{i=1}^{s+1} \mathbf{1}(R_i \le R_1)/(s+1), \qquad p_- = \sum_{i=1}^{s+1} \mathbf{1}(R_i < R_1)/(s+1). \tag{1}$$

The extreme rank $R_i$ of the vector $T_i$ is the minimum of the pointwise ranks $R_{ij}$, $j = 1, \ldots, d$, of 
its elements $T_{ij}$ among the corresponding elements $T_{1j}, T_{2j}, \ldots, T_{(s+1)j}$ in all s + 1 vectors, 

$$R_i = \min_j R_{ij}. \tag{2}$$

How the elementwise ranks are determined depends on whether a one-sided or a two-sided test is 
to be performed. Let $r_{1j}, r_{2j}, \ldots, r_{(s+1)j}$ be the raw ranks of $T_{1j}, T_{2j}, \ldots, T_{(s+1)j}$, such that the 
smallest $T_{ij}$ has rank 1. In the case of ties, the raw ranks are averaged. The resulting pointwise 
ranks are calculated as 

$$R_{ij} = \begin{cases}
r_{ij}, & \text{one-sided test, small } T \text{ is considered extreme,} \\
s + 1 - r_{ij}, & \text{one-sided test, large } T \text{ is considered extreme,} \\
\min(r_{ij}, s + 1 - r_{ij}), & \text{two-sided test.}
\end{cases} \tag{3}$$
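
The calculation of the extreme ranks and of the p-interval can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation of the spptest library: the function names are hypothetical, and ranks counted from the top are taken as n + 1 - r with n = s + 1 vectors, so that the largest value gets rank 1 just like the smallest one. That convention is an assumption made here for symmetry; it shifts the "large is extreme" ranks by one compared with formula (3) and leaves one-sided p-values unchanged.

```python
def average_ranks(values):
    """Raw ranks with the smallest value ranked 1; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def extreme_ranks(vectors, alternative="two.sided"):
    """Extreme rank of every vector; vectors[0] is the observed one."""
    n = len(vectors)  # n = s + 1
    d = len(vectors[0])
    result = [float("inf")] * n
    for j in range(d):
        raw = average_ranks([v[j] for v in vectors])
        for i in range(n):
            low = raw[i]           # small values are extreme
            high = n + 1 - raw[i]  # large values are extreme (assumed convention)
            if alternative == "less":
                pointwise = low
            elif alternative == "greater":
                pointwise = high
            else:
                pointwise = min(low, high)
            result[i] = min(result[i], pointwise)
    return result


def p_interval(ranks):
    """Liberal and conservative p-values (p_minus, p_plus) of the first vector."""
    n = len(ranks)
    p_minus = sum(1 for r in ranks if r < ranks[0]) / n
    p_plus = sum(1 for r in ranks if r <= ranks[0]) / n
    return p_minus, p_plus


# Tiny hypothetical example: one observed vector followed by s = 4 simulated vectors.
vectors = [[10.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
ranks = extreme_ranks(vectors)
print(ranks, p_interval(ranks))
```

In this toy example three of the five vectors share the most extreme rank, so the p-interval is wide; this is the tie problem addressed in Sections 2.2 and 2.3.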

2.1.2 The graphical envelope test 

For the graphical version of the test, an appropriate low rank $R_{(\alpha)}$ is determined as shown below, 
and the envelope is constructed as the hull of those "less extreme" vectors $T_i$ that have rank 
$R_i \ge R_{(\alpha)}$. Let $I_\alpha = \{i \in 1, \ldots, s+1 : R_i \ge R_{(\alpha)}\}$ be the index set of these vectors, and define 

$$T^{(\alpha)}_{\mathrm{low}\,j} = \min_{i \in I_\alpha} T_{ij}, \qquad T^{(\alpha)}_{\mathrm{upp}\,j} = \max_{i \in I_\alpha} T_{ij} \tag{4}$$

for the two-sided test. For one-sided tests, let $T^{(\alpha)}_{\mathrm{low}\,j} = -\infty$ or $T^{(\alpha)}_{\mathrm{upp}\,j} = \infty$, respectively. By 
choosing $R_{(\alpha)}$ as the smallest value in $\{R_1, \ldots, R_{s+1}\}$ for which 

$$\sum_{i=1}^{s+1} \mathbf{1}(R_i \le R_{(\alpha)}) > \alpha(s+1), \tag{5}$$

we get the following interpretation. 

If the observed vector leaves this envelope in some point, i.e. $R_1 < R_{(\alpha)}$, which is equivalent 
to $p_+ \le \alpha$, the null hypothesis is rejected. If the observed vector is completely inside this 
envelope, i.e. $R_1 > R_{(\alpha)}$, which is equivalent to $p_- > \alpha$, the null hypothesis is not rejected. If 
the observed vector coincides in some point with the border of this envelope, i.e. $R_1 = R_{(\alpha)}$, 
which is equivalent to $p_- \le \alpha < p_+$, the rejection of the null hypothesis remains undecided. 

The above interpretation is a direct consequence of Theorem 4.2 in Myllymaki et al. (2015b) 
for the two-sided rank envelope test. The theorem can be proved in the same way 
for the one-sided rank (envelope) test. 
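
The construction of the two-sided envelope can be sketched as follows. The sketch reads rule (5) as "the smallest rank value whose cumulative count exceeds alpha (s + 1)", assumes that there are no ties within a component, and counts ranks from the top as n + 1 - r with n = s + 1 vectors (an assumption made for symmetry). All function names and the simulated data are hypothetical.

```python
import random


def extreme_ranks_two_sided(vectors):
    """Two-sided extreme ranks; assumes no ties within a component."""
    n, d = len(vectors), len(vectors[0])
    ranks = [n] * n
    for j in range(d):
        order = sorted(range(n), key=lambda i: vectors[i][j])
        for position, i in enumerate(order):
            raw = position + 1
            ranks[i] = min(ranks[i], raw, n + 1 - raw)
    return ranks


def critical_rank(ranks, alpha):
    """Smallest rank value whose cumulative count exceeds alpha * (s + 1)."""
    n = len(ranks)
    for value in sorted(set(ranks)):
        if sum(1 for r in ranks if r <= value) > alpha * n:
            return value
    return max(ranks)


def rank_envelope(vectors, alpha=0.05):
    """Return (lower, upper, decision); vectors[0] is the observed vector."""
    ranks = extreme_ranks_two_sided(vectors)
    r_alpha = critical_rank(ranks, alpha)
    # The envelope is the hull of the vectors that are not more extreme than r_alpha.
    kept = [v for v, r in zip(vectors, ranks) if r >= r_alpha]
    d = len(vectors[0])
    lower = [min(v[j] for v in kept) for j in range(d)]
    upper = [max(v[j] for v in kept) for j in range(d)]
    if ranks[0] < r_alpha:
        decision = "reject"
    elif ranks[0] > r_alpha:
        decision = "do not reject"
    else:
        decision = "undecided"
    return lower, upper, decision


# Hypothetical data: s = 199 simulated vectors of d = 5 standard normal values,
# and an observed vector that is far out in its third component.
rng = random.Random(0)
simulated = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(199)]
observed = [0.0, 0.0, 6.0, 0.0, 0.0]
lower, upper, decision = rank_envelope([observed] + simulated)
print(decision)
```

The component in which the observed vector leaves the interval between `lower` and `upper` is the one that the graphical interpretation would highlight.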


2.2 The problem of ties, and p-intervals 

If the extreme ranks $R_i$ were almost surely different, the p-values of the global envelope test 
would take the values 1/(s+1), 2/(s+1), ..., 1 with equal probability under the null hypothesis. 
However, due to the construction as the vectorwise minimum of pointwise ranks (2), ties occur very 
often. In a one-sided test with d-variate vectors, up to d out of the s + 1 vectors could take rank 
1. Therefore, instead of a single p-value, Myllymaki et al. (2015b) suggest accompanying the test 
with a p-interval $(p_-, p_+]$. The length of the p-interval, 

$$p_+ - p_- = \frac{1}{s+1} \sum_{i=1}^{s+1} \mathbf{1}(R_i = R_1),$$

determines the "grey" zone of the test. It was shown in Myllymaki et al. (2015b) that this length 
is of order 1/(s + 1). However, it also depends on the correlation structure of the multivariate 
vectors. In Section 3, we investigate the needed number of simulations s with respect to the type of 
correlation structure of the multivariate vector. 


2.3 Breaking ties by extreme rank count ordering 


To resolve the problem of the "grey" zone of the test, Myllymaki et al. (2015b) defined the extreme rank 
count ordering, which refines the extreme rank ordering in order to minimize the possibility of 
ties. We briefly rephrase the definition in the multivariate vector case. 

Consider the vectors of pointwise ordered ranks $\mathbf{R}_i = (R_{i[1]}, R_{i[2]}, \ldots, R_{i[d]})$, where 
$\{R_{i[1]}, \ldots, R_{i[d]}\} = \{R_{i1}, \ldots, R_{id}\}$ and $R_{i[j]} \le R_{i[j']}$ whenever $j \le j'$. 

The extreme rank given in (2) corresponds to $R_i = R_{i[1]}$. It was suggested to replace this by 
the rank under the extreme rank count ordering of the vectors $\mathbf{R}_i$, namely to consider the ordering based 
on 

$$R^{\mathrm{erc}}_i = \sum_{i'=1}^{s+1} \mathbf{1}(\mathbf{R}_{i'} \prec \mathbf{R}_i), \tag{6}$$

where 

$$\mathbf{R}_i \prec \mathbf{R}_{i'} \iff \exists\, n \le d : R_{i[j]} = R_{i'[j]} \ \forall j < n, \ R_{i[n]} < R_{i'[n]}.$$
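
The extreme rank count ordering is a lexicographic comparison of the sorted pointwise ranks, which can be sketched directly with Python list comparison. The function names are hypothetical, and the p-value below is formed in analogy to the conservative p-value in (1); that analogy is an assumption of this sketch.

```python
def extreme_rank_counts(pointwise_ranks):
    """For each vector, count the vectors that precede it in the lexicographic
    ordering of the sorted pointwise ranks (smaller count = more extreme)."""
    ordered = [sorted(row) for row in pointwise_ranks]
    # Python compares lists lexicographically: equal up to some position,
    # then strictly smaller at the first position where they differ.
    return [sum(1 for other in ordered if other < mine) for mine in ordered]


def erc_p_value(pointwise_ranks):
    """p-value of the first (observed) vector under the extreme rank count ordering."""
    counts = extreme_rank_counts(pointwise_ranks)
    return sum(1 for c in counts if c <= counts[0]) / len(counts)


# Hypothetical pointwise ranks for 4 vectors with d = 2 components.
pointwise_ranks = [[1, 3], [1, 2], [2, 2], [3, 1]]
counts = extreme_rank_counts(pointwise_ranks)
print(counts, erc_p_value(pointwise_ranks))
```

In this example three vectors share the extreme rank 1, and the refinement separates the second vector from the first and the fourth, which remain tied because their sorted rank vectors are identical.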


2.4 The type I error 


The possibility of ties in the extreme rank count ordering is rather small; therefore the rank 
(envelope) test with extreme rank count ordering as a solution for the ties has, in practice, the exact 
type I error under simple null hypotheses. 

In the case of a composite null hypothesis, where some parameters of the null model have 
to be estimated, the test is usually conservative. The amount of conservativeness depends 
on the correlation of the estimating and testing functions and on the precision of the estimation 
procedure. However, the test can instead be liberal if the estimation procedure is biased, as is 
shown in Section 4. Myllymaki et al. (2015b) showed the possibility of applying the procedure 
described in Dao and Genton (2014) to the rank envelope test. This adjusted rank test corrects the 
type I error of the test under a composite hypothesis, but it is a rather time-consuming procedure, 
because it requires $s^2$ simulations. A composite null hypothesis is tested in Section 4, where 
the adjusted rank test is applied. A composite null hypothesis is also tested in a part of the 
simulation study of Section 5. There, we use only the pure rank test due to time constraints. Note 
that this simplification does not influence the conclusions made from the simulation study. 


3. APPROPRIATE NUMBER OF SIMULATIONS 

Note first that the rank test with extreme rank count has exact type I error under the simple null 
hypothesis with any number of simulations. Only the precision of the graphical 
interpretation, which is given by the width of the p-interval, can be unsatisfactory. In this section, we give some 
recommendations for the number of simulations s for the common choice of the significance level, 
$\alpha = 0.05$. For this significance level we would like the width of the p-interval to be at most 0.01. 


3.1 Number of simulations for a low dimensional test vector 

Having a test vector of dimension d, the maximal width of the p-interval is simply 

$$2d/(s + 1) \tag{7}$$

for a two-sided rank test and 

$$d/(s + 1) \tag{8}$$

for a one-sided rank test. 

Usually, there is low correlation between components of the test vector when d is small and, 
therefore, the above formulas can be used to determine the appropriate number of simulations in 
this case. In the case of a high dimensional test vector, formula (7) or (8) gives an upper limit 
for the width. However, the choice of s based on this upper limit would be too time consuming. 
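
As a small worked example of formulas (7) and (8), the following sketch computes the smallest s for which the maximal p-interval width does not exceed a target. The function name is hypothetical, and exact rational arithmetic is used only to avoid floating-point rounding at the boundary.

```python
import math
from fractions import Fraction


def simulations_needed(d, max_width="0.01", two_sided=True):
    """Smallest s such that the maximal p-interval width, 2d/(s+1) or d/(s+1),
    is at most max_width."""
    factor = 2 if two_sided else 1
    width = Fraction(str(max_width))
    # factor * d / (s + 1) <= width  <=>  s + 1 >= factor * d / width
    return math.ceil(factor * d / width) - 1


print(simulations_needed(3))                   # two-sided test, d = 3
print(simulations_needed(3, two_sided=False))  # one-sided test, d = 3
print(simulations_needed(500))                 # upper-limit choice for d = 500
```

The last call illustrates why the upper limit is impractical for a discretized test function: the required number of simulations grows linearly in d.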


3.2 Number of simulations for a test function 


A test function is in practice a high-dimensional test vector with high correlation between the 
components. (In our studies, we have used d = 500.) Due to the high correlation the width of 
the p-interval is much smaller than the upper limit given by (7) or (8). In our previous study 
(Myllymaki et al. 2015b), where only this case was studied, it was recommended to use at least 
2500 simulations when testing at the significance level 0.05. 


3.3 Number of simulations for a combination of several test functions 

The rank test for a combination of several test functions can be seen as a two-stage procedure. The 
first step is to compute the extreme rank ordering for each sub-test, i.e. for each test function 
separately. The global extreme rank $R_i$ is then the minimum of the extreme ranks from the 
sub-tests. Thus the second step can be seen as a one-sided rank test performed on the extreme ranks 
computed in the sub-tests. Since different simulations generally contribute 
to the most extreme rank in different sub-tests, the recommended number of simulations for a combination of k test 
functions is 2500k. 


3.4 Number of simulations for the rank test with extreme rank count as a solution for ties 

The extreme rank count is practically a continuous test statistic. Therefore the probability of 
having a tie in extreme rank counts is very small and can be disregarded. This means that it is 
possible to use the extreme rank count ordering with fewer simulations than the extreme rank ordering, 
but then the graphical envelope interpretation may be lost, because the data function may coincide 
with a boundary of the envelope. 

We remark here that classically in a Monte Carlo test the p-value is estimated from the given 
simulations of the test statistics ($u_i$ in the deviation test, $R_i$ in the rank test). The standard 
deviation of such a point estimate decreases as the inverse square root of the number of simulations 
performed. Loosmore and Ford (2006) recommended using at least 999 simulations to reduce 
this uncertainty. 

Mrkvicka, Myllymaki, Hahn: Rank correction to multiple testing (June 5, 2015, 00:24) 


4. TEST WITH USE OF LOW DIMENSIONAL RANDOM VECTOR 


In this section, we demonstrate the rank test for combining several univariate Monte Carlo tests. 
As an example, we use the Boolean model of disks (see e.g. Stoyan et al. 1995; Mrkvicka 
2009) as a null model for an image of mammary cancer tissue, see Figure 1, which is regarded 
as a random closed set. These data originate from a collection of 200 images studied in detail in 
Mrkvicka and Mattfeldt (2011). 

We chose the distribution of disk radii in the Boolean model to be lognormal and estimated 
the parameters of the model by means of the contact distribution function (Molchanov 1995). A 
realization of the fitted process is shown in Figure 1. (The difference between the data and the fit is mainly 
in the shape of the sets, because the fitted model uses only disks; therefore test statistics 
that do not depend heavily on the shape of the sets were chosen.) Next, in order to conduct a test, simulations were 
generated from the fitted null model. 



Figure 1. Left: A binary image of mammary cancer tissue with resolution of 512 x 512 pixels. Right: A 
realization of the fitted Boolean model of disks with lognormal distribution of disk radii. 


As test statistics we choose intrinsic volumes, because they are the most important 
characteristics of random closed sets and because they are not related to the characteristic used for 
estimation, i.e. the contact distribution function. In $\ell$-dimensional Euclidean space, the intrinsic 
volumes $V_0(K), \ldots, V_\ell(K)$ of a convex body $K \subset \mathbb{R}^\ell$ are determined by the Steiner formula 

$$V_\ell(K_\varepsilon) = \sum_{k=0}^{\ell} \varepsilon^k \omega_k V_{\ell-k}(K),$$

where $V_\ell$ is the volume ($\ell$-dimensional Lebesgue measure), $K_\varepsilon = \{x \in \mathbb{R}^\ell : \mathrm{dist}(x, K) \le \varepsilon\}$ is 
the (closed) $\varepsilon$-parallel set to K and $\omega_k$ denotes the volume of the unit ball in $\mathbb{R}^k$. (Under a different 
normalization, they are known as quermassintegrals or Minkowski functionals.) The intrinsic 
volumes can be extended additively to polyconvex sets (sets from the convex ring); for details see 
Schneider (1993). In the plane, $V_2(K)$ is the volume, $V_1(K)$ is one half of the circumference of 
the border $\partial K$ and $V_0(K)$ is the Euler number. 

We then performed the rank test with s = 299 simulations, where the test vector is 
three-dimensional, consisting of all three intrinsic volumes. The resulting p-interval is (0.003, 0.02) and 
the p-value based on the extreme rank count ordering is 0.013. Since we deal with a composite 
hypothesis, we also performed the adjusted rank test with s(s + 1) simulations (Myllymaki et al. 
2015b). The resulting graphical interpretation is shown in Figure 2. The envelope for the adjusted 
test is wider than for the pure test, which indicates that the estimation procedure is not perfect. 
(The histogram of p-values inside the adjusted test also shows a strong preference for small p-values.) 
The resulting adjusted significance level is $\alpha^* = 0.013$, which leads us to the rejection of 
the null hypothesis at the significance level 0.05. Furthermore, the graphical interpretation shows 
that the null hypothesis is rejected due to the Euler number, which lies on the adjusted envelope, 
i.e. the data set has more isolated cells than the Boolean model. 


[Figure 2 plot, titled "Adjusted rank envelope test"; the three components on the horizontal axis are V2, 10*V1 and 1000*V0.] 


Figure 2. The outer bounds show the result of the adjusted rank envelope test with 299 x 300 simulations 
of the null model, where the test vector consists of all three intrinsic volumes, whereas the inner bounds 
show the result of the rank envelope test with 299 simulations. The crosses correspond to the intrinsic 
volumes of the data. 


5. GOODNESS-OF-FIT TEST WITH MANY TEST FUNCTIONS SIMULTANEOUSLY 


Deviation and envelope tests are the main tools for goodness-of-fit tests in spatial point 
process statistics (see e.g. Illian et al. 2008; Diggle 2013; Myllymaki et al. 2015a; Myllymaki et al. 
2015b). These tests are based on a test function T(r) on a chosen interval I of distances r. 

For a test, one needs to choose T(r). Previous experience and possible alternative hypotheses 
may suggest a test function to be used. However, often it is not known in advance which test 
function leads to a powerful test in the situation at hand, and one would like to employ several 
test functions, which, however, leads to multiple testing. The rank test can be used in this 
situation both for combining several deviation tests into one test and for combining several rank 
envelope tests (Myllymaki et al. 2015b) into one test. 

In the deviation test, the discrepancies between the (empirical and simulated) test functions 
$T_i(r)$, $i = 1, \ldots, s+1$, and the expectation of T(r) under the null hypothesis $H_0$ are 
summarized in single values $u_i$, $i = 1, \ldots, s+1$, by a deviation measure, e.g. the (scaled) integrated 
discrepancy over all distances on I or the (scaled) maximum discrepancy over the distances on I, 
see Illian et al. (2008) and Myllymaki et al. (2015a). If the data value $u_1$ obtains an extreme 
rank among all the $u_i$'s, the test leads to rejection of $H_0$. To combine several deviation tests into 
one test by means of the rank test, the test vector $T_i$ is taken to consist of the deviation measures 
$u_i^1, u_i^2, \ldots, u_i^d$, where d is the number of deviation tests, i.e. the number of test functions used. 
Thus, we are dealing with a low dimensional test vector in the rank test as in Section 4. This rank 
test is one-sided, since only large values of u are typically considered significant. 

There is also a graphical interpretation available for the classical (Ripley 1981) and scaled 
(Myllymaki et al. 2015b) maximum deviation measure tests. Unfortunately, we lose this 
graphical interpretation when combining several test functions. We obtain only a graphical 
interpretation for the combined rank test, telling which test function leads to the possible rejection of 
$H_0$. 

In the combined rank envelope test, the test vector is taken to consist of all values of the first 
test function followed by all values of the second test function, etc. Thus, the length of the test 
vector becomes d x K, where d stands for the number of test functions and K for the number of 
distances r (in our simulation study below K = 500). We consider the same number of distances 
r for each test function in order to ensure that each test function has the same importance in the 
global test. The rank test is two-sided. 

In combining several rank envelope tests, we have the graphical interpretation for each indi¬ 
vidual test, which is a great advantage of the combined rank envelope test in comparison to the 
combined deviation test. 


In the study of Schladitz et al. (2003), the spatial structure of intramembranous particles was 
investigated separately by the L-, F-, G- and J-functions (see e.g. Illian et al. 2008). It was 
pointed out that some features of the spatial structure which are not visible in the L-function can 
be visible in the G-function. We use the data of this study to show the advantages of our rank test. Figure 3 
shows the spatial locations of intramembranous particles taken from the first sample of the untreated group 
of the study of Schladitz et al. (2003). We investigated a Gibbs hard core model (i.e., the Strauss 
process where the interaction parameter equals zero, see e.g. Stoyan et al. 1995) as a null model 
for these locations of particles, as was done in Myllymaki et al. (2015b) using one test function. 
We conditioned the model on the number of points, and fixed the only parameter, i.e. the hard 
core distance, to the minimum distance between two particles, i.e., 5.85 pixels in our sample, thus dealing 
with a simple hypothesis. 

The combined 95% envelope is shown in Figure 4. The test reveals deviation of the data from 
the null model for both small and medium values of r. The deviation is detected by the L- and J-functions 
for small r, whereas it is detected by the G- and J-functions for medium r. The deviation for small r is 
probably caused by the fact that the particles have variable size, while the deviation for medium 
r is caused by clustering of particles at these distances. 



Figure 3. The first point pattern of intramembranous particles from control untreated group observed in 
a window 512 x 512 pixels. 


[Figure 4 plot, titled "Rank envelope test: p-interval = (0, 0.019)"; panels: (L(r)-r)/5, F(r), G(r), J(r)-1; legend: data function, central function.] 

Figure 4. The combined rank envelope test with L(r) - r, F(r), G(r) and J(r) functions performed with 
s = 9999 simulations of the null model. 


[Figure 5 plot, titled "Rank envelope test: p-interval = (0.019, 0.022)", alternative = "greater"; components 1 to 4 on the horizontal axis: L(r)-r, F(r), G(r), J(r).] 

Figure 5. The rank test used for the combination of four scaled deviation measures of L(r) - r, F(r), 
G(r) and J(r) functions performed with s = 999 simulations of the null model. 


Further, Figure 5 shows the rank test used for the combination of four scaled (asymmetric 
quantile, see Myllymaki et al. 2015a) maximum deviation measures of the L(r) - r, F(r), G(r) 
and J(r) functions for testing the same situation as in Figure 4. We can see that the rejection of 
this test is due to the L-, G- and J-functions, but based on this test we do not know the reason for the 
rejection. 

We investigated by a simulation study how the power of the rank envelope test and the scaled 
maximum deviation test (Myllymaki et al. 2015a) is affected by using one or more test functions. 
Since the results for the deviation test were similar to those of the rank envelope test, in the 
following we show and discuss only the latter. 


5.1 Simulation study 


In our previous study (Myllymaki et al. 2015b), we studied the power of rank envelope and 
deviation tests with L- and J-functions. We found, as expected, that which of the test functions 
is more powerful depends on the null and true models. Now we extend the previous study to 
tests with several test functions. We would like to show that the empirical type I error 
probability stays at the desired level in the combined test and that the power of the combined test is not 
much smaller than the power of the test with the most powerful test function. 

In addition to the L- and J-functions, we add to the study the empty space function (F-function) 
and the nearest neighbour distribution function (G-function). As test functions we use standard 
estimators of these summary functions (see e.g. Illian et al. 2008). 


5.1.1 Design of the study 

We used the following point process models: 

1. Poisson process with intensity $\lambda$, i.e. complete spatial randomness (CSR), 

2. Strauss($\beta$, $\gamma$, R) process, where R > 0 is the interaction radius, and $\beta > 0$ and $0 \le \gamma \le 1$ 
control the intensity and the strength of interaction, respectively, 

3. Matern cluster process MatClust($\lambda_p$, $R_d$, $\mu_d$), where $\lambda_p$ is the intensity of parent points, 
and $R_d$ and $\mu_d$ specify the cluster radius and mean number per cluster for the daughter 
points, 

4. non-overlapping Matern cluster process NoOMatClust($\lambda_p$, $R_d$, $\mu_d$, R), where the parent 
points follow a hard-core process with hard-core distance R (i.e. a Strauss($\beta$, 0, R) process), 
and 

5. mixed Matern cluster process (MixMatClust), which is a superposition of two Matern 
cluster processes. 


For further details on the processes see e.g. Illian et al. (2008). The first three processes were 
used as null models. The employed null and true models are specified more precisely below in 
Tables 2 and 3. 

The chosen true model was simulated 1000 times in the unit square. Then the parameters 
of the null model were estimated for each simulation. For the CSR hypothesis, only the inten¬ 
sity of the tested point pattern was estimated. The parameters of the Matern cluster process 
were estimated by the minimum contrast estimation based on the pair correlation function (the 
non-cumulative counterpart of the L-function), whereas for the Strauss process the the logistic 
likelihood method ( jBaddeley et al.|20T4 1 was used. 

Then s = 1999 simulations of the fitted null model were done and the rank test with extreme 
rank count ordering was applied to each combination of test functions (L, F, G. J). For the 
sake of compactness, the tables in the following show only results for the combinations where 
the most commonly used function, the L-function, is involved. For each combination of test 
functions and each model, we calculated the proportion of rejections of the null model among 
the 1000 simulations (rejection if p < 0.05). 
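
The test applied to each simulated pattern can be illustrated with a small Python sketch. It computes the p-interval of a two-sided rank envelope test and an extreme rank count p-value from a set of curves. This is only a sketch under stated assumptions: the function names are hypothetical, ties in the pointwise ranks are broken by position, and the extreme rank count ordering is taken to be the lexicographic order of the sorted pointwise ranks, which is one reading of that ordering and not a reference implementation.

```python
def pointwise_ranks(curves):
    """Two-sided pointwise ranks; rank 1 marks the most extreme value.

    curves[0] is the data curve, curves[1:] are the s simulated curves.
    Ties are broken by position (an assumption of this sketch).
    """
    n, k = len(curves), len(curves[0])
    ranks = [[0] * k for _ in range(n)]
    for j in range(k):
        order = sorted(range(n), key=lambda i: curves[i][j])
        for pos, i in enumerate(order):
            # rank counted from the smallest and from the largest value
            ranks[i][j] = min(pos + 1, n - pos)
    return ranks


def rank_envelope_test(curves):
    """Return (p_minus, p_erc, p_plus) for the data curve curves[0]."""
    ranks = pointwise_ranks(curves)
    n = len(curves)  # n = s + 1
    extreme = [min(r) for r in ranks]
    p_minus = sum(e < extreme[0] for e in extreme) / n
    p_plus = sum(e <= extreme[0] for e in extreme) / n
    # Extreme rank count ordering, assumed here to be the lexicographic
    # order of the sorted pointwise ranks (smaller = more extreme).
    sorted_ranks = [sorted(r) for r in ranks]
    p_erc = sum(v <= sorted_ranks[0] for v in sorted_ranks) / n
    return p_minus, p_erc, p_plus


# Toy example: the data curve lies above all nine simulated curves.
curves = [[100.0] * 3] + [[float(i)] * 3 for i in range(1, 10)]
print(rank_envelope_test(curves))
```

In a study like the one described here, a pattern would count as a rejection when the extreme rank count p-value falls below 0.05, and the rejection rate is the share of such patterns among the 1000 simulations.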


5.1.2 Empirical type I error probabilities of the combined rank test for a simple null model 

First, we studied the type I error probabilities for the CSR, Strauss(350, 0.4, 0.03) and
MatClust(50, 0.06, 4) models. The latter two models deviate strongly from CSR and they are similar
to the null models used in the power study. The parameters of the null model were assumed to be known.
All the estimated type I error probabilities are close to the nominal level $\alpha = 0.05$, see Table 1.
Indeed, for $\alpha = 0.05$, the proportions of rejections should be in the interval (0.037, 0.064)
with the probability 0.95 (given by the 2.5% and 97.5% quantiles of the binomial distribution
with parameters 1000 and 0.05). Thus, we conclude that the rank test has correct empirical type
I error probability for any combination of test functions.
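
The quoted interval can be checked directly. The short Python sketch below accumulates the binomial probability mass function until a given level is reached. The helper name `binomial_quantile` is hypothetical, and the quantile is taken as the smallest k with P(X <= k) >= q, which is an assumption about the convention used.

```python
def binomial_quantile(n, p, q):
    """Smallest k with P(X <= k) >= q for X ~ Binomial(n, p)."""
    pmf = (1.0 - p) ** n  # P(X = 0)
    cdf = pmf
    k = 0
    while cdf < q:
        # recursion P(X = k + 1) = P(X = k) * (n - k) / (k + 1) * p / (1 - p)
        pmf *= (n - k) / (k + 1) * p / (1.0 - p)
        k += 1
        cdf += pmf
    return k


n_sim, alpha = 1000, 0.05
lower = binomial_quantile(n_sim, alpha, 0.025)
upper = binomial_quantile(n_sim, alpha, 0.975)
print(lower / n_sim, upper / n_sim)
```

Dividing the two quantiles by the number of simulated patterns gives the range within which an estimated rejection proportion is compatible with the nominal level.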


Table 1. Estimated type I error probabilities of the rank test with different combinations of test functions.
The parameters of the null model are known.

Simulated model           L      F      G      J      L,F    L,G    L,J    L,G,J  L,F,G,J
Poisson(200)              0.047  0.046  0.049  0.044  0.049  0.046  0.044  0.043  0.043
Strauss(350, 0.4, 0.03)   0.047  0.056  0.045  0.052  0.051  0.045  0.052  0.052  0.056
MatClust(50, 0.06, 4)     0.064  0.047  0.057  0.048  0.056  0.053  0.050  0.050  0.051


5.1.3 Effect of overfitting - type I error probabilities of the combined rank test for a composite null model

Practically, the true parameters are unknown and have to be estimated. In such a case, the estimated
type I errors are often appropriate for such test functions which are only loosely correlated
with the fitting procedure (Diggle 2013; Myllymaki et al. 2015b). Based on our study, see Table 2,
they are the L- and J-function for the test of complete spatial randomness and J for the Strauss
process. For the Matern cluster process, the J-function is also less conservative than the other
functions. The test function F seems to be very conservative in all cases. Table 2 further shows
that combining functions that are only loosely correlated with the estimation procedure and
the ones that are highly correlated averages the level of conservativeness of the test.


Table 2. Estimated type I error probabilities of the rank test with different combinations of test functions.
The parameters of the null model are fitted.

Simulated model           L      F      G      J      L,F    L,G    L,J    L,G,J  L,F,G,J
Poisson(200)              0.054  0.000  0.022  0.041  0.045  0.036  0.042  0.035  0.033
Strauss(350, 0.4, 0.03)   0.025  0.000  0.015  0.015  0.019  0.022  0.021  0.017  0.017
MatClust(50, 0.06, 4)     0.008  0.000  0.014  0.032  0.005  0.012  0.022  0.019  0.021


5.1.4 Rejection rates of the combined rank test 

The null and true models are shown in the two leftmost columns of Table 3. The parameters of
the alternative models were chosen such that the deviation from the null model is moderate. One
realization of each studied alternative model together with its fitted null model is shown in Figure 6
to illustrate the closeness of the null and true models. Table 3 also shows the obtained rejection
rates of the rank test with various combinations of test functions. We observed the following:

1. The combined test has only slightly lower power than the most powerful test statistic in all
studied cases.

2. Different single test functions can lead to very different powers (see e.g. line 5 of Table 3).

3. The last line of Table 3 shows that using a highly conservative test function (as L for the
Matern cluster process) together with a less conservative test function (here J) can increase
the power with respect to using only the less conservative test function.

4. Finally, the last column of Table 3 shows that even adding the F-function, which has only
very weak power, decreases the power only a little.


Table 3. Estimated powers of the rank test with different combinations of test functions. The Strauss null
model was fitted with R = 0.02.

True model                Null      L      F      G      J      L,F    L,G    L,J    L,G,J  L,F,G,J
Strauss(250, 0.6, 0.03)   CSR       0.622  0.010  0.428  0.615  0.591  0.553  0.608  0.566  0.553
MatClust(200, 0.06, 1)    CSR       0.789  0.208  0.377  0.573  0.772  0.744  0.773  0.747  0.737
Strauss(350, 0.4, 0.03)   Strauss   0.816  0.012  0.585  0.721  0.781  0.746  0.768  0.712  0.691
MixMatClust               MatClust  0.000  0.000  0.949  0.949  0.000  0.944  0.944  0.944  0.944
NoOMatClust               MatClust  0.537  0.000  0.185  0.267  0.490  0.448  0.488  0.439  0.424

This simulation study shows that we can generally construct a rank test which is sensitive
to “all” possible alternative hypotheses by joining several test functions without worrying about
losing the power of the test.

Figure 6. First and third lines show realizations of the chosen true models, while the second and fourth
lines show their fitted null models. The mixed Matern cluster model is a superposition of
MatClust(10, 0.06, 30) and MatClust(10, 0.03, 30).


6. SIMULTANEOUS GOODNESS-OF-FIT TEST FOR SEVERAL POINT PATTERNS


Figure 7 shows point patterns of entry and end points of epidermal nerve fibers (ENFs) that were
previously analysed in Olsbo et al. (2013). The ENFs are thin nerve fibers living in the outermost
layer of the skin called epidermis. While Kennedy et al. (1996) reported diminished numbers of
ENFs in subjects suffering from diabetic neuropathy, Waller et al. (2011) and Myllymaki et al.
(2014) tried to quantify increased clustering of ENFs in such subjects based on spatial second-order
analysis. Olsbo et al. (2013) further proposed preliminary point process models for the
entry and end points based on data from thigh of healthy subjects.

We tested the CSR hypothesis for the entry and end point patterns in Figure 7, as was done in
Olsbo et al. (2013) as the first step in analysing the data sets. While in Olsbo et al. (2013)
the CSR hypothesis was tested separately for each pattern by means of the refined envelope test
proposed by Grabarnik et al. (2011), we now performed the test jointly for all the entry point
patterns and for all the end point patterns. As in Olsbo et al. (2013), we used as the test function
an estimator of the L-function with translational edge correction (see e.g. Illian et al. 2008). We
performed a two-sided rank envelope test on the interval of distances I = [0, 80] (micrometers).

The combined rank envelope test (s = 20000) rejects the CSR hypothesis both for the entry
and end point patterns, see Figures 8 and 9. The reason for rejection for the entry points is the
pattern of Subject 230. For end points, the rejection is due to the three subjects 224, 230 and 256.
The same was in fact concluded in Olsbo et al. (2013). However, here we do only one test with a
global type I error probability instead of four tests and provide a p-value for this combined test.
The p-values based on extreme rank count ordering are 0.016 and 0.0036 for the entry and end
points, respectively.

The number of performed simulations s = 20000 is obviously large. We could have performed
the test based on the extreme rank count ordering using a smaller number of simulations
in order to reduce the computational load. For the entry points, the extreme rank count p-value
obtained with s = 4999 is 0.0014, while the p-interval is (< 0.001, 0.040). For the end points,
the corresponding p-value and p-interval are 0.0022 and (< 0.001, 0.047). So, we in fact come to
the same conclusions with s = 4999 as with s = 20000 in this case (figures not shown).

Figure 7. Epidermal nerve fiber patterns, where fibers are replaced by line segments connecting the end
points (small black dots) and the entry points (black circles). Subjects: (a) 171, (b) 224, (c) 230 and (d)
256.

Figure 8. Rank envelope test for testing CSR of the ENF entry point patterns in Figure 7. The number of
simulations is s = 20000 and T(r) = L(r).

Rank envelope test: p-interval = (0, 0.015)

Figure 9. Rank envelope test for testing CSR of the ENF end point patterns in Figure 7. The number of
simulations is s = 20000 and T(r) = L(r).


7. COMPARISON OF GROUPS OF POINT PATTERNS (ANOVA) 


In this section, we describe how the rank test can be used to compare groups of point patterns via
a chosen test function which is computed for every point pattern. This task leads to a functional
one-way ANOVA problem, which has already been addressed by many authors. For example, Cuevas
et al. (2004) introduced an asymptotic version of the ANOVA F-test, Ramsay and Silverman (2006)
describe a bootstrap procedure based on pointwise F-tests, Abramovich and Angelini (2006)
used wavelet smoothing techniques, Ferraty et al. (2007) used a dimension reduction approach
and Cuesta-Albertos and Febrero-Bande (2010) used several random univariate projections to
which the F-test is applied, after which the tests are bound together through the false discovery rate.
The last procedure was applied to point pattern data of colorectal tumors in Aliy et al. (2013).
It is also possible to transform the function into one number and apply a classical ANOVA,
but such procedures can be blind to some alternatives.

In the point pattern literature the group comparison is done either using functional ANOVA
as in Aliy et al. (2013) or using a bootstrap procedure as in Diggle et al. (1991), Diggle et al.
(2000) or Schladitz et al. (2003). Furthermore, Hahn (2012) proposed a pure permutation procedure
to correct the inaccurate significance level of the bootstrap procedure. In these works a
univariate statistic summarizing the overall differences between the groups is used and permuted
or bootstrapped.

In Section 7.1 we describe how the rank test can be used to solve the general ANOVA problem.
An advantage of our proposed rank method is the resulting graphical interpretation of the
results: it directly identifies the distances which are responsible for the potential rejection.
Another advantage is that the rank test and, thus, the proposed functional ANOVA test is performed
exactly with the desired significance level.

In Section 7.2 we investigate the possibility to use the rank test for determining the differences
between groups of functions. The rank test can be used to determine which group differences 
and which distances are responsible for the possible rejection. 

To describe our approach we reanalyse the data of Diggle et al. (1991) containing 3 groups of
pyramidal neurons in the cingulate cortex of humans: the normal (control) group, the schizoaffective
group and the schizophrenic group. One representative pattern from each group can be seen in
Figure 10. (We discarded the point patterns with less than 20 points prior to the analysis, which led to
12 point patterns in the normal group, 7 patterns in the schizoaffective group and 7 patterns in
the schizophrenic group.)



Figure 10. One representative point pattern of each group of pyramidal neuron positions. 


As a test function we chose the estimator of the L-function with the isotropic correction.
Figure 11 shows these estimated L-functions in the three groups.

Figure 11. The estimated centred L-functions in the three groups (panels: normal, schizoaffective,
schizophrenic).


7.1 Functional ANOVA 

In many functional ANOVA papers, the fact that the functions are measured only in a finite set of 
distances r is utilized. Because of this discretization, it is possible to apply any ANOVA analysis 
for every r-value separately and obtain K dependent values of an ANOVA statistic, where K 
stands for the number of distances $r$. The statistic can be, for example, the F-value of the F-test, the log
likelihood, BIC or a statistic of a nonparametric test. Then the test vector used in the rank test is

$$T = (F(r_1), F(r_2), \ldots, F(r_K)),$$

where $F(r_i)$ stands for the chosen univariate statistic. The simulations, which are necessary for
applying the rank test, are produced by permuting the test functions.
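
The construction of the test vector and of its permuted counterparts can be sketched in Python as follows. The sketch uses the ordinary (unweighted, equal-variance) one-way F-statistic at every distance, not the weighted ANOVA used for the neuron data, and the function names and toy curves are hypothetical. It assumes the within-group sum of squares is never zero.

```python
import random


def pointwise_f(groups):
    """One-way ANOVA F-statistic at every distance r.

    groups: list of J groups, each a list of curves (lists of K values).
    Returns the vector (F(r_1), ..., F(r_K)).
    """
    n_dist = len(groups[0][0])
    n_total = sum(len(g) for g in groups)
    n_groups = len(groups)
    f_values = []
    for k in range(n_dist):
        values = [[curve[k] for curve in g] for g in groups]
        grand = sum(sum(v) for v in values) / n_total
        means = [sum(v) / len(v) for v in values]
        ss_between = sum(len(v) * (m - grand) ** 2 for v, m in zip(values, means))
        ss_within = sum((x - m) ** 2 for v, m in zip(values, means) for x in v)
        f_values.append((ss_between / (n_groups - 1)) / (ss_within / (n_total - n_groups)))
    return f_values


def permuted_f_vectors(groups, n_perm, seed=0):
    """F-vector of the data (first entry) and of n_perm random relabellings."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    pooled = [curve for g in groups for curve in g]
    vectors = [pointwise_f(groups)]
    for _ in range(n_perm):
        rng.shuffle(pooled)  # permute the test functions across groups
        start, shuffled = 0, []
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        vectors.append(pointwise_f(shuffled))
    return vectors


# Toy data: 3 groups of 3 curves, each evaluated at K = 2 distances.
groups = [
    [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]],
    [[2.0, 4.0], [3.0, 6.0], [4.0, 8.0]],
    [[3.0, 6.0], [4.0, 8.0], [5.0, 10.0]],
]
vectors = permuted_f_vectors(groups, n_perm=5)
```

The resulting vectors would then be passed to a one-sided rank test, since only large values of the F-statistic speak against the null hypothesis.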

As an illustration of this method we computed the one-sided rank test for the F-statistics from
2499 permutations for the neuron data. The number of $r$ values was set to 500. The weighted
ANOVA was performed in order to deal with unequal group variances of L-functions, which
arise from the unequal mean numbers of points in the point patterns in different groups. The
Kruskal-Wallis test with the $\chi^2$-statistic could be applied instead.

Rank envelope test: p-interval = (0.147, 0.152)
Alternative = "greater"

Figure 12. Rank envelope test for comparison of 3 groups of L-functions via the F-statistic of the weighted
ANOVA done with s = 2499 simulations. Solid line corresponds to the data F-function, the upper dashed
line is the 95% upper envelope, the lower dashed line is 0 corresponding to the lower envelope and the dotted
line corresponds to the average of F-functions from the permutations.


The resulting 95% simultaneous upper envelope can be seen in Figure 12, showing no rejection
at any distance $r$.

7.2 One way group comparison 

Below we describe a new functional ANOVA procedure which is also based on our combined 
rank test and which is directly able to identify which groups are different and which distances 
are responsible for the possible rejection. All that is done at the common and exact significance
level $\alpha$ guaranteed by the rank test.

Let us assume that we have $J$ groups which contain $n_1, \ldots, n_J$ test functions which are
estimated from $n_1, \ldots, n_J$ point patterns, and denote the test functions by $T_{ij}$,
$i = 1, \ldots, J$, $j = 1, \ldots, n_i$. Assume that there exist nonrandom functions $\mu(r)$ and $\mu_i(r)$ such that

$$T_{ij}(r) = \mu(r) + \mu_i(r) + e_{ij}(r), \quad i = 1, \ldots, J, \; j = 1, \ldots, n_i,$$

where $e_{ij}(r)$ are i.i.d. samples from a distribution $G(r)$ for every $r$. The only condition which
$G(r)$ has to satisfy is that it has mean zero and finite variance. Thus we are dealing with a completely
nonparametric comparison of groups of functions.

We want to test the hypothesis

$$H_0 : \mu_i(r) = 0, \quad i = 1, \ldots, J.$$

This hypothesis can be clearly tested by the rank test, if the test vector is taken to consist of the 
average of test functions in the first group followed by the average of test functions in the second 
group, etc. We can shortly write that 

$$T = (\bar{T}_1(r), \bar{T}_2(r), \ldots, \bar{T}_J(r)),$$

where $\bar{T}_i(r) = (\bar{T}_i(r_1), \ldots, \bar{T}_i(r_K))$. Thus, the length of the test vector becomes $J \times K$, where
$K$ stands for the number of distances $r$. The simulations, which are necessary for applying the rank
test, are again produced by permuting the test functions $T_{ij}(r)$.

The hypothesis $H_0$ is equivalent to the hypothesis

$$H'_0 : \mu_i(r) - \mu_j(r) = 0, \quad i = 1, \ldots, J - 1, \; j = i + 1, \ldots, J.$$

This hypothesis corresponds to the post-hoc test usually done after the ANOVA test is significant.
However, this hypothesis can be directly tested by the combined rank test, if the test vector is taken
to consist of differences of the group averages of test functions. We can shortly write that

$$T' = (\bar{T}_1(r) - \bar{T}_2(r), \bar{T}_1(r) - \bar{T}_3(r), \ldots, \bar{T}_{J-1}(r) - \bar{T}_J(r)).$$

Here the length of the test vector becomes $J(J - 1)/2 \times K$.

Recall that both tests described above are done at one common significance level $\alpha$, which
means that it is not necessary to perform the ANOVA test prior to the post-hoc test. Instead it is
possible to apply only the post-hoc test, obtaining an answer about the overall ANOVA test and
also about the differences of groups. Note that the two tests test the same hypothesis $H_0$ but the
tests are not the same: they are sensitive to different departures from $H_0$.
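
The two test vectors can be assembled as in the following Python sketch. The helper names are hypothetical, the curves are assumed to be evaluated on a common grid of K distances, and plain unweighted group averages are used.

```python
from itertools import combinations


def group_mean(curves):
    """Pointwise average of a list of curves of equal length."""
    n = len(curves)
    return [sum(c[k] for c in curves) / n for k in range(len(curves[0]))]


def mean_vector(groups):
    """The J group averages concatenated (length J * K)."""
    return [x for g in groups for x in group_mean(g)]


def difference_vector(groups):
    """All pairwise differences of group averages (length J(J-1)/2 * K)."""
    means = [group_mean(g) for g in groups]
    vector = []
    for a, b in combinations(range(len(means)), 2):
        vector.extend(x - y for x, y in zip(means[a], means[b]))
    return vector


# Toy data: J = 3 groups of unequal size, K = 2 distances.
groups = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0]],
    [[2.0, 2.0], [4.0, 4.0], [6.0, 6.0]],
]
t_means = mean_vector(groups)
t_diffs = difference_vector(groups)
```

Applying the same two functions to every permutation of the curves across groups gives the simulated vectors against which the data vector is ranked.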

7.2.1 Correction for unequal variances for testing $H'_0$

The two procedures above can be applied if the variances are equal across the groups of functions.
To deal with different variances of group means of test functions in the different permutations,
we rescale them to unit variance. Then the test vector becomes

$$\left( \frac{\bar{T}_1(r) - \bar{T}_2(r)}{\sqrt{\mathrm{Var}(\bar{T}_1(r)) + \mathrm{Var}(\bar{T}_2(r))}}, \ldots, \frac{\bar{T}_{J-1}(r) - \bar{T}_J(r)}{\sqrt{\mathrm{Var}(\bar{T}_{J-1}(r)) + \mathrm{Var}(\bar{T}_J(r))}} \right).$$

In practice, $\mathrm{Var}(\bar{T}_i(r))$ must be estimated for each $r$. For small samples, the sample variance
estimator can have a large variance, which may influence the procedure. The variance can be smoothed
by applying a moving average to the estimated variance with a chosen window size $b$ and replacing
the sample variance in the test vector above by its moving average analogue,

$$T'_2 = \left( \frac{\bar{T}_1(r) - \bar{T}_2(r)}{\sqrt{\mathrm{MA}_b(\mathrm{Var}(\bar{T}_1(r))) + \mathrm{MA}_b(\mathrm{Var}(\bar{T}_2(r)))}}, \ldots, \frac{\bar{T}_{J-1}(r) - \bar{T}_J(r)}{\sqrt{\mathrm{MA}_b(\mathrm{Var}(\bar{T}_{J-1}(r))) + \mathrm{MA}_b(\mathrm{Var}(\bar{T}_J(r)))}} \right).$$
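
One block of the rescaled test vector can be sketched in Python as below. Several details are assumptions of this sketch and are not specified in the text: the moving average is centred with an odd window that is truncated at both ends of the distance grid, and the variance of a group average is estimated as the pointwise sample variance divided by the group size. All function names are hypothetical.

```python
def moving_average(values, window):
    """Centred moving average; the window is truncated at both ends."""
    half = window // 2
    out = []
    for k in range(len(values)):
        chunk = values[max(0, k - half):k + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def mean_and_variance_of_mean(curves):
    """Pointwise group average and estimated variance of that average."""
    n = len(curves)
    means, variances = [], []
    for k in range(len(curves[0])):
        column = [c[k] for c in curves]
        m = sum(column) / n
        s2 = sum((x - m) ** 2 for x in column) / (n - 1)  # sample variance
        means.append(m)
        variances.append(s2 / n)  # variance of the average of n curves
    return means, variances


def scaled_difference(group_a, group_b, window):
    """Difference of two group averages scaled by smoothed variances."""
    mean_a, var_a = mean_and_variance_of_mean(group_a)
    mean_b, var_b = mean_and_variance_of_mean(group_b)
    smooth_a = moving_average(var_a, window)
    smooth_b = moving_average(var_b, window)
    return [(a - b) / (va + vb) ** 0.5
            for a, b, va, vb in zip(mean_a, mean_b, smooth_a, smooth_b)]


# Toy data: two groups of two curves at K = 2 distances.
block = scaled_difference([[0.0, 0.0], [2.0, 2.0]],
                          [[3.0, 3.0], [5.0, 5.0]], window=1)
```

Concatenating such blocks over all pairs of groups gives the full vector. A window of 1 switches the smoothing off and reproduces the unsmoothed scaling.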

7.2.2 Weighted average


Finally, instead of the basic average of functions $\bar{T}$ it is possible to use a weighted average of
functions in order to decrease the variance of the group average and to give more weight to those
functions that are more trustworthy. For example, in the case of point pattern comparison, it is
possible to follow Diggle et al. (2000) and apply the weighted average to L-functions, where
the weights correspond to the number of points in the point pattern. In fact, the variance of the
estimated L-function behaves approximately as $1/m_{ij}$, where $m_{ij}$ is the number of points in the
$ij$-th point pattern. Thus, such a weighted average can decrease the group variability of L-functions and
improve the procedure. For point patterns, the weighted average is defined as

$$\bar{L}_i(r) = \sum_{j=1}^{n_i} \frac{m_{ij}}{m_i} L_{ij}(r),$$

where $m_i = \sum_{j=1}^{n_i} m_{ij}$. The variance of the weighted average,

$$\mathrm{Var}(\bar{L}_i(r)) = \sum_{j=1}^{n_i} \frac{m_{ij}^2}{m_i^2} \mathrm{Var}(L_{ij}(r)),$$

then has to be used in $T'_2$. The variance $\mathrm{Var}(L_{ij}(r))$ can be estimated as above.
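
As a minimal Python sketch of the two formulas, assuming the per-pattern variances $\mathrm{Var}(L_{ij}(r))$ are already available as curves (how they are estimated is left open here) and using hypothetical function names:

```python
def weighted_group_average(curves, counts):
    """Average of curves with weights m_ij / m_i (point counts)."""
    total = sum(counts)
    weights = [m / total for m in counts]
    n_dist = len(curves[0])
    return [sum(w * c[k] for w, c in zip(weights, curves)) for k in range(n_dist)]


def variance_of_weighted_average(curve_variances, counts):
    """Sum over patterns of (m_ij / m_i)^2 * Var(L_ij(r)), for every r."""
    total = sum(counts)
    n_dist = len(curve_variances[0])
    return [sum((m / total) ** 2 * v[k] for m, v in zip(counts, curve_variances))
            for k in range(n_dist)]


# Toy data: two patterns with 10 and 30 points, K = 2 distances.
curves = [[1.0, 2.0], [3.0, 6.0]]
counts = [10, 30]
avg = weighted_group_average(curves, counts)
var = variance_of_weighted_average([[4.0, 4.0], [4.0, 4.0]], counts)
```

In the toy data the pattern with 30 points receives weight 0.75, so the group average is pulled towards its curve.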

Figure 13 shows the result of the comparison of 3 groups of point patterns via differences of
group weighted averages when $T'_2$ was used as the test vector. The number of $r$ values was set to 500
and the window size of the moving average was set to 75. Each subplot shows the comparison of
2 groups. The test statistic being positive corresponds to the situation that the first group is more
clustered than the second group in the comparison. Our result shows no differences between
groups, similarly to the original study and to the functional ANOVA test shown in Figure 12.

Figure 13. Rank envelope test for comparison of 3 groups of L-functions via difference of group weighted
averages done with s = 7500 simulations. The left subplot corresponds to the difference between the first
and second group, the middle subplot corresponds to the difference between the first and third group
and the right subplot corresponds to the difference between the second and third group. The grey area
represents the 95% global envelope.

We can observe that the first and second group are rather similar (the first subplot). Therefore
we join the first and second group as was done in the original study. The result of the comparison
of the joined first and second group with the third group is shown in Figure 14.

Rank envelope test: p-interval = (0.067, 0.069)

Figure 14. Rank envelope test for comparison of 2 groups of L-functions via difference of group averages
done with s = 2500 simulations. The plot corresponds to the difference between the joined first and
second group and the third group. The grey area represents the 95% global envelope.


Here we also do not observe a significant difference between the joined first and second group
and the third group at the significance level 0.05. Note that the original study reported a
p-value of similar size, but Hahn (2012) found out that the original method was liberal.


7.2.3 Correction for unequal variances for testing $H_0$

If many groups are to be compared, the above test based on group differences will consist of
many subtests and the power can be small. In such a case it is possible to return to the test of
hypothesis $H_0$, which consists of fewer subtests. To account for unequal variances in this test it
is possible to set the test vector of the rank test as

$$T_2 = \left( \frac{\bar{T}_1(r) - \bar{T}_{-1}(r)}{\sqrt{\mathrm{MA}_b(\mathrm{Var}(\bar{T}_1(r))) + \mathrm{MA}_b(\mathrm{Var}(\bar{T}_{-1}(r)))}}, \ldots, \frac{\bar{T}_J(r) - \bar{T}_{-J}(r)}{\sqrt{\mathrm{MA}_b(\mathrm{Var}(\bar{T}_J(r))) + \mathrm{MA}_b(\mathrm{Var}(\bar{T}_{-J}(r)))}} \right),$$

where $\bar{T}_{-i}$ denotes the average of all test functions without the test functions of the $i$-th group.
The weighted average can also be applied in the same way as above.

Figure 15 shows the comparison of 3 groups by means of $T_2$ with application of the weighted
average. Each subplot shows the comparison of a group with respect to the rest of groups. The
test statistic being positive corresponds to the situation that the group is more clustered than the
rest of groups. Our result again shows no differences between groups.

Figure 15. Rank envelope test for comparison of three groups of L-functions via difference of the averages
of a group and the rest of the groups done with s = 7500 simulations. The left subplot corresponds to
the difference of the first group with respect to the remaining groups, the middle subplot corresponds to
the second group and the right subplot corresponds to the third group. The grey area represents the 95%
global envelope.


8. TEST FOR DEPENDENCE OF COMPONENTS IN MULTI-TYPE POINT PROCESSES


The random superposition hypothesis for a bivariate point process says that the point process
results from the union of two independent components. Testing of such independence between
two types of points is typically based on the bivariate $L_{12}(r)$ function. Simulations under this
hypothesis are obtained by keeping the points of type 1 fixed, and shifting the points of type 2
with periodic boundary conditions (see e.g. Illian et al. 2008).
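
A shift with periodic boundary conditions can be sketched in Python as follows. The sketch assumes a rectangular window with its lower left corner at the origin and a shift vector drawn uniformly over the window; the function name and the toy coordinates are hypothetical.

```python
import random


def toroidal_shift(points, width, height, rng):
    """Shift all points by one random vector, wrapping around the window."""
    dx = rng.uniform(0.0, width)
    dy = rng.uniform(0.0, height)
    return [((x + dx) % width, (y + dy) % height) for x, y in points]


rng = random.Random(1)
type2 = [(10.0, 20.0), (55.5, 80.0), (99.0, 1.0)]  # toy type 2 points
shifted = toroidal_shift(type2, 100.0, 100.0, rng)
```

Each simulation keeps the type 1 points as they are and pairs them with a freshly shifted copy of the type 2 points, so the internal structure of both components is preserved while any dependence between them is broken.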

For testing independence between $n > 2$ types of points, to the best of our knowledge, there
are no “multivariate” L functions available. Thus, a typical way to test the independence of
$n > 2$ sub-point patterns is by going through all the pairs of types and thus performing
$n(n - 1)/2$ tests, where the test of independence of points of type $i$ and $j$ is based on a bivariate
$L_{ij}$ function ($i, j \in \{1, \ldots, n\}$). By means of the rank test, we can combine these tests to one
test, i.e. to a test for the random superposition of $n > 2$ components.

We demonstrate this test for the patterns of the four richest tree species in an area of size 100
m × 100 m in a tropical rainforest at Barro Colorado Island, Panama, see Figure 16. The data
originate from a 50 ha Forest Dynamics Plot in 2005, see Hubbell et al. (2005), Condit (1998) and
Hubbell et al. (1999).


Thus, we would like to test whether there are any small scale interactions between these
species. For this purpose, we used the $L_{ij}(r)$ functions on $I = [1, 25]$ and the multiple rank
test with six sub-tests, $(i, j) \in \{(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)\}$. For computational
reasons, we used only s = 4999 simulations and calculated the extreme rank count p-value. The
obtained joint p-value of the test is 0.3416, while the obtained p-interval is (0.3388, 0.3516),
thus giving the same test result “no rejection”. Thus, according to this test, we have no evidence
against the random superposition of the patterns of the four individual species.
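
The way the six sub-tests enter one combined test vector can be sketched in Python. The dictionary of curves and the function name are hypothetical placeholders; in practice each entry would hold an estimated $L_{ij}$ curve on the common grid of distances.

```python
from itertools import combinations


def combine_pairwise(curves_by_pair, n_types):
    """Concatenate the curves of all pairs i < j into one test vector."""
    vector = []
    for pair in combinations(range(1, n_types + 1), 2):
        vector.extend(curves_by_pair[pair])
    return vector


n_types = 4
pairs = list(combinations(range(1, n_types + 1), 2))
# Placeholder curves with K = 3 values per pair (not real estimates).
curves_by_pair = {pair: [float(pair[0]), float(pair[1]), 0.0] for pair in pairs}
combined = combine_pairwise(curves_by_pair, n_types)
```

The same concatenation is applied to every simulated pattern, and the rank test is then run once on the long vectors.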


Figure 16. Four rainforest species in an area of size 100 m × 100 m. DES2PA: Desmopsis panamensis;
FARAOC: Faramea occidentalis; HYBAPR: Hybanthus prunifolius; TRI2TU: Trichilia tuberculata.


9. DISCUSSION 

In this paper, we have shown many possible applications of the rank (envelope) test for correcting
the multiple testing problem. The rank (envelope) test can be seen as a general solution to multiple
Monte Carlo tests. We have shown how the rank test can be used to perform a combined test for
several univariate Monte Carlo tests, a combined test for pointwise Monte Carlo tests with a test
function $T(r)$, $r \in I$, and a combined test for several rank envelope tests performed with various
test functions. The third case allows many interesting applications.

We considered the goodness-of-fit test with several test functions for a point pattern. In this 
case we have shown that using more test functions decreases the power of the combined test just a 
little in comparison to using the "best" function. Therefore a whole range of test functions can be 
used so that the test is sensitive to "all" possible deviations from the null model. Our simulation 
study also shows that using a powerless test function in the set of test functions does not decrease 
the power of the combined test much. 

We also employed the goodness-of-fit test for several point patterns simultaneously. This 
problem may also be solved by computing the average test function over the point patterns and 
by comparing it with its simulated counterparts. Since the combined rank test compares each 
point pattern with the null model individually (but simultaneously at the global level $\alpha$), we
believe it can lead to higher power than the other approach if the point patterns deviate from the 
null model in different ways. 

We considered comparison of several groups of point patterns. Since a test function is used 
in the test instead of a point pattern, this test can be applied to any functional data and it can be 
seen as a functional ANOVA. Usually the test function is summarized into one number and then 
the classical ANOVA is applied or a bootstrap method is applied. In our suggested approaches 
the whole test function is used and graphical interpretation shows which distances of the test 
function lead to a potential rejection. Additionally, in the second suggested approach, it is also 
seen which group leads to the possible rejection. This second approach can be seen as a post hoc 
test, which is performed at the exact significance level. This can be seen as an advantage also with
respect to the classical ANOVA, because after an ANOVA test one has to perform a further post 
hoc test in order to find out between which groups there is a difference. Such post hoc tests are a 
bit conservative with respect to original ANOVA test, whereas in our approach the post hoc test 
is performed on the exact significance level. (The null hypothesis is simple in this case.) 

Finally, we have applied the combined rank envelope test to the test of dependence of components
in a multi-type point pattern with more than two components. Since performing the rank
envelope test with many subtests (for many components) needs many simulations, we showed
here the possibility of using a lower number of simulations together with our solution for ties (the
extreme rank count ordering).

The aim of this paper was to show possible applications of the combined rank envelope test
and its advantages. We are sure that this is not an exhaustive list of the applications. There
are further applications, e.g., in the fields of functional data analysis, geostatistics or random set
theory.